Abstract. In static analysis, approximation is typically encoded by abstract domains, providing systematic guidelines for specifying approximate semantic functions and precision assessments. However, it may happen that an abstract domain contains redundant information for the specific purpose of approximating a given semantic function modeling some behavior of a system. This paper introduces correctness kernels of abstract interpretations, a methodology for simplifying abstract domains, i.e. removing abstract values from them, in a maximal way while retaining exactly the same approximate behavior of the system under analysis. We show that, in abstract model checking and predicate abstraction, correctness kernels provide a simplification paradigm of the abstract state space that is guided by examples, meaning that it preserves spuriousness of examples (i.e., abstract paths). In particular, we show how correctness kernels can be integrated with the well-known CEGAR (CounterExample-Guided Abstraction Refinement) methodology.



1 Introduction 

In static analysis and verification, model-driven abstraction refinement has emerged in the last decade as a fundamental method for improving abstractions towards more precise yet efficient analyses. The basic idea is simple: given an abstraction modeling some observational behavior of the system to analyze, refine the abstraction in order to remove the artificial computations that may appear in the approximate analysis by considering how the concrete system behaves when false alarms or spurious traces are encountered. The general concept of using spurious counterexamples for refining an abstraction stems from the CounterExample-Guided Abstraction Refinement (CEGAR) paradigm [4, 5]. The model here drives the automatic identification of prefixes of the counterexample path that do not correspond to an actual trace in the concrete model, by isolating abstract (failure) states that need to be refined in order to eliminate that spurious counterexample. Model-driven refinements, such as CEGAR, provide algorithmic methods for achieving abstractions that are complete (i.e., precise [14, 18]) with respect to some given property of the concrete model.

We investigate here the dual problem of abstraction simplification. Instead of refining abstractions in order to eliminate spurious traces, our goal is to simplify an abstraction $A$ towards a simpler (ideally, the simplest) model $A_s$ that maintains the same approximate behavior as $A$ does. In abstract model checking, this abstraction simplification has to keep the same examples of the concrete system in the following sense. Recall that an abstract path $\pi$ in an abstract transition system $\mathcal{A}$ is spurious when no real concrete path is abstracted to $\pi$. Assume that a given abstract state space $A$ of a system $\mathcal{A}$ gets simplified to $A_s$ and thus gives rise to a more abstract system $\mathcal{A}_s$. Then, we say that $\mathcal{A}_s$ keeps the same examples of $\mathcal{A}$ when the following condition is satisfied: if $\pi_{A_s}$ is a spurious path in the simplified abstract system $\mathcal{A}_s$ then there exists a spurious path $\pi_A$ in the original system $\mathcal{A}$ that is abstracted to $\pi_{A_s}$. Such a methodology is called EGAS, Example-Guided Abstraction Simplification, since this abstraction simplification does not add spurious paths, namely, it does keep examples, since each spurious path in $\mathcal{A}_s$ comes as an abstraction of a spurious path in $\mathcal{A}$.

Let us illustrate how EGAS works through a simple example. Let us consider the abstract transition system $\mathcal{A}$ in Figure 1, where concrete states are numbers which are abstracted by blocks of the state partition $\{[1], [2,3], [4,5], [6], [7], [8,9]\}$. The abstract state space of $\mathcal{A}$ is simplified by merging the abstract states $[2,3]$ and $[4,5]$: EGAS guarantees that this can be safely done because $\mathrm{pre}^\sharp([2,3]) = \{[1]\} = \mathrm{pre}^\sharp([4,5])$ and $\mathrm{post}^\sharp([2,3]) = \{[6],[7]\} = \mathrm{post}^\sharp([4,5])$, where $\mathrm{pre}^\sharp$ and $\mathrm{post}^\sharp$ denote, respectively, the abstract predecessor and successor functions in $\mathcal{A}$. This abstraction simplification leads to the abstract system $\mathcal{A}'$ in Figure 1. Let us observe that the abstract path $\pi = \langle [1], [2,3,4,5], [7], [8,9] \rangle$ in $\mathcal{A}'$ is spurious because there is no concrete path whose abstraction in $\mathcal{A}'$ is $\pi$, while $\pi$ is instead the abstraction of the spurious path $\langle [1], [4,5], [7], [8,9] \rangle$ in $\mathcal{A}$. On the other hand, consider the path $\sigma = \langle [1], [2,3,4,5], [6], [8,9] \rangle$ in $\mathcal{A}'$ and observe that all the paths in $\mathcal{A}$ that are abstracted to $\sigma$, i.e. $\langle [1], [2,3], [6], [8,9] \rangle$ and $\langle [1], [4,5], [6], [8,9] \rangle$, are not spurious. This is consistent with the fact that $\sigma$ actually is not a spurious path. Likewise, $\mathcal{A}'$ can be further simplified to the abstract system $\mathcal{A}''$ where the blocks $[6]$ and $[7]$ are merged into a new abstract state $[6,7]$. This transformation also keeps examples because now there is no spurious path in $\mathcal{A}''$. Let us also notice that if $\mathcal{A}$ were simplified to an abstract system $\mathcal{A}'''$ by merging the blocks $[1]$ and $[2,3]$ into a new abstract state $[1,2,3]$, then this transformation would not keep examples: we would obtain the spurious loop path $\tau = \langle [1,2,3], [1,2,3], [1,2,3], \ldots \rangle$ in $\mathcal{A}'''$ (because in $\mathcal{A}'''$ the state $[1,2,3]$ has a self-loop), while there is no corresponding spurious abstract path in $\mathcal{A}$ whose abstraction in $\mathcal{A}'''$ is $\tau$.

EGAS is formalized within the standard abstract interpretation framework by Cousot and Cousot [8, 9]. This ensures that EGAS can be applied both in abstract model checking and in abstract interpretation. Consider for instance two basic abstract domains $A_1$ and $A_2$ for sign analysis of an integer variable, where the sets of integers in $\wp(\mathbb{Z})$ form the concrete domain: $A_1 = \{\mathbb{Z}, \mathbb{Z}_{\leq 0}, \mathbb{Z}_{\geq 0}, 0\}$ and $A_2 = \{\mathbb{Z}, \mathbb{Z}_{\geq 0}\}$, both ordered by inclusion.
Recall that in abstract interpretation the best correct approximation of a semantic function $f$ on an abstract domain $A$ that is defined through abstraction/concretization maps $\alpha$/$\gamma$ is given by $f^A \triangleq \alpha \circ f \circ \gamma$. Consider a simple operation of increment `x++` on an integer variable `x`. In this case, the best correct approximations on $A_1$ and $A_2$ are as follows:

$${+}{+}^{A_1} = \{0 \mapsto \mathbb{Z}_{\geq 0},\ \mathbb{Z}_{\leq 0} \mapsto \mathbb{Z},\ \mathbb{Z}_{\geq 0} \mapsto \mathbb{Z}_{\geq 0},\ \mathbb{Z} \mapsto \mathbb{Z}\},$$
$${+}{+}^{A_2} = \{\mathbb{Z}_{\geq 0} \mapsto \mathbb{Z}_{\geq 0},\ \mathbb{Z} \mapsto \mathbb{Z}\}.$$

We observe that the best correct approximations of ${+}{+}$ in $A_1$ and $A_2$ encode the same function, meaning that the approximations of ${+}{+}$ in $A_1$ and $A_2$ are equivalent, the latter being clearly simpler. In fact, we have that $\gamma_{A_1} \circ {+}{+}^{A_1} \circ \alpha_{A_1}$ and $\gamma_{A_2} \circ {+}{+}^{A_2} \circ \alpha_{A_2}$ are exactly the same function on $\wp(\mathbb{Z})$. In other terms, the abstract domain $A_1$ contains some "irrelevant" abstract values for approximating the increment operation, that is, $0$ and $\mathbb{Z}_{\leq 0}$. This simplification of an abstract domain relatively to a semantic function is formalized in the most general abstract interpretation setting. This allows us to provide, for generic continuous semantic functions, a systematic and constructive method, that we call correctness kernel, for simplifying a given abstraction $A$ relatively to a given semantic function $f$ towards the unique minimal abstract domain that induces an equivalent approximate behavior of $f$ as in $A$. We show how correctness kernels can be embedded within the CEGAR methodology by providing a novel refinement heuristics in a CEGAR iteration step which turns out to be more accurate than the basic refinement heuristics [5]. We also describe how correctness kernels may be applied in predicate abstraction-based model checking [11, 19] for reducing the search space without applying Ball et al.'s [2] Cartesian abstractions, which typically yield additional loss of precision.

This is an extended and revised version of the conference paper ifTT l that includes 
full proofs. 



2 Correctness Kernels 

As usual in standard abstract interpretation [8, 9], abstract domains (or abstractions) are specified by Galois connections/insertions (GCs/GIs for short) or, equivalently, adjunctions. Concrete and abstract domains, $\langle C, \leq_C \rangle$ and $\langle A, \leq_A \rangle$, are assumed to be complete lattices which are related by abstraction and concretization maps $\alpha : C \to A$ and $\gamma : A \to C$ that give rise to an adjunction $(\alpha, C, A, \gamma)$, that is, for all $a$ and $c$, $\alpha(c) \leq_A a \Leftrightarrow c \leq_C \gamma(a)$. It is known that $\mu_A \triangleq \gamma \circ \alpha : C \to C$ is an upper closure operator (uco) on $C$, i.e. a monotone, idempotent and increasing function. Also, abstract domains can be equivalently defined as ucos, meaning that any GI $(\alpha, C, A, \gamma)$ induces the uco $\mu_A$, any uco $\mu : C \to C$ induces the GI $(\mu, C, \mu(C), \lambda x.x)$, and these two transforms are the inverse of each other. GIs of a common concrete domain $C$ are preordered w.r.t. their relative precision as usual: $\mathcal{G}_1 = (\alpha_1, C, A_1, \gamma_1) \sqsubseteq \mathcal{G}_2 = (\alpha_2, C, A_2, \gamma_2)$ (i.e. $A_1$/$A_2$ is a refinement/simplification of $A_2$/$A_1$) iff $\gamma_2(\alpha_2(C)) \subseteq \gamma_1(\alpha_1(C))$. Moreover, $\mathcal{G}_1$ and $\mathcal{G}_2$ are equivalent when $\mathcal{G}_1 \sqsubseteq \mathcal{G}_2$ and $\mathcal{G}_2 \sqsubseteq \mathcal{G}_1$. We denote by $\mathrm{Abs}(C)$ the family of abstract domains of $C$ up to the above equivalence. It is well known that $\langle \mathrm{Abs}(C), \sqsubseteq \rangle$ is a complete lattice, so that one can consider the most concrete simplification (i.e., lub $\sqcup$) and the most abstract refinement (i.e., glb $\sqcap$) of any family of abstract domains. Let us recall that the lattice of abstract domains $\langle \mathrm{Abs}(C), \sqsubseteq \rangle$ is isomorphic to the lattice of ucos on $C$, $\langle \mathrm{uco}(C), \sqsubseteq \rangle$, where $\sqsubseteq$ denotes the pointwise ordering between functions, so that lub's and glb's of abstractions can be equivalently characterized in $\mathrm{uco}(C)$. Let us also recall that each $\mu \in \mathrm{uco}(C)$ is uniquely determined by its image $\mathrm{img}(\mu) = \mu(C)$ because $\mu = \lambda x. \wedge\{y \in C \mid y \in \mu(C),\ x \leq y\}$. Moreover, a subset $X \subseteq C$ is the image of some uco on $C$ iff $X$ is meet-closed, i.e. $X = \mathrm{Cl}_\wedge(X) \triangleq \{\wedge Y \mid Y \subseteq X\}$ (note that $\top_C = \wedge\varnothing \in \mathrm{Cl}_\wedge(X)$). Often, we will identify ucos with their images. This does not give rise to ambiguity, since one can distinguish their use as functions or sets according to the context. Hence, if $A, B \in \mathrm{Abs}(C)$ are two abstractions then they can be viewed as images of two ucos on $C$, denoted respectively by $\mu_A$ and $\mu_B$, so that $A$ is more precise than $B$ when $\mathrm{img}(\mu_B) \subseteq \mathrm{img}(\mu_A)$.

Let $f : C \to C$ be some concrete semantic function (for simplicity, we consider 1-ary functions) and let $f^\sharp : A \to A$ be a corresponding abstract function defined on some abstraction $A \in \mathrm{Abs}(C)$. Then, $\langle A, f^\sharp \rangle$ is a sound abstract interpretation when $\alpha \circ f \sqsubseteq f^\sharp \circ \alpha$. Moreover, the abstract function $f^A \triangleq \alpha \circ f \circ \gamma : A \to A$ is called the best correct approximation (b.c.a.) of $f$ on $A$ because any abstract interpretation $\langle A, f^\sharp \rangle$ is sound iff $f^A \sqsubseteq f^\sharp$. Hence, for any abstraction $A$, $f^A$ plays the role of the best possible approximation of $f$ on $A$.

2.1 The Problem 

Given a semantic function $f : C \to C$ on some concrete domain $C$ and an abstraction $A \in \mathrm{Abs}(C)$, does there exist the most abstract domain that induces the same best correct approximation of $f$ as $A$ does?

Let us formalize the above question. Consider two abstractions $A, B \in \mathrm{Abs}(C)$. We say that $A$ and $B$ induce the same best correct approximation of $f$ when $f^A$ and $f^B$ are the same function up to isomorphic representations of abstract values. If $\mu_A$ and $\mu_B$ are the corresponding ucos then this boils down to:

$$\mu_A \circ f \circ \mu_A = \mu_B \circ f \circ \mu_B.$$

In order to keep the notation easy, this is denoted simply by $f^A = f^B$. Also, if $F \subseteq C \to C$ is a set of concrete functions then $F^A = F^B$ means that for any $f \in F$, $f^A = f^B$. Hence, given $A \in \mathrm{Abs}(C)$ and by defining

$$A_s \triangleq \sqcup\{B \in \mathrm{Abs}(C) \mid F^B = F^A\}$$

the question is whether $F^{A_s} = F^A$ holds or not. This leads us to the following notion of correctness kernel.
Definition 2.1. Given $F \subseteq C \to C$, define $\mathcal{K}_F : \mathrm{Abs}(C) \to \mathrm{Abs}(C)$ as

$$\mathcal{K}_F(A) \triangleq \sqcup\{B \in \mathrm{Abs}(C) \mid F^B = F^A\}.$$

If $F^{\mathcal{K}_F(A)} = F^A$ then $\mathcal{K}_F(A)$ is called the correctness kernel of $A$ for $F$. □

It is worth remarking that the dual question on the existence of the most concrete domain that induces the same best correct approximation of $f$ as $A$ has a negative answer, as shown by the following simple example.

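Example 2.2. Consider the lattice $C$ depicted below.

        5
       / \
      3   4
       \ /
        2
        |
        1

Let us also consider the monotonic function $f : C \to C$ defined as $f = \{1 \mapsto 1,\ 2 \mapsto 1,\ 3 \mapsto 5,\ 4 \mapsto 5,\ 5 \mapsto 5\}$ and the abstraction $\mu \in \mathrm{uco}(C)$ whose image is $\mu = \{1, 5\}$. Let us observe that $\mu \circ f \circ \mu = \{1 \mapsto 1,\ 2 \mapsto 5,\ 3 \mapsto 5,\ 4 \mapsto 5,\ 5 \mapsto 5\}$. Consider now the abstractions $\rho_1 = \{1, 3, 5\}$ and $\rho_2 = \{1, 4, 5\}$ and observe that $\rho_i \circ f \circ \rho_i = \mu \circ f \circ \mu$ for $i = 1, 2$. However, we have that $\rho_1 \sqcap \rho_2 = \lambda x.x$, because the image of $\rho_1 \sqcap \rho_2$ is $\mathrm{Cl}_\wedge(\rho_1 \cup \rho_2) = \{1, 2, 3, 4, 5\}$. Hence, $(\rho_1 \sqcap \rho_2) \circ f \circ (\rho_1 \sqcap \rho_2) = f \neq \mu \circ f \circ \mu$. Therefore, if we let $\rho_r \triangleq \sqcap\{\rho \in \mathrm{uco}(C) \mid \rho \circ f \circ \rho = \mu \circ f \circ \mu\}$ then $\rho_r = \lambda x.x$. Consequently, the most concrete domain that induces the same best correct approximation of $f$ as $\mu$ does not exist. □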
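The computations of Example 2.2 can be replayed mechanically. The following is a minimal Python sketch, not part of the original development: the encoding of the lattice through up-sets (`UP`) and the helper names `meet`, `meet_closure`, `uco` and `bca` are illustrative assumptions. It represents a uco by its meet-closed image and computes $\mu \circ f \circ \mu$ as a finite map.

```python
# Lattice of Example 2.2: 1 < 2 < 3, 4 < 5, with 3 and 4 incomparable.
# UP[a] is the set of elements above (or equal to) a; this encoding is an assumption.
UP = {1: {1, 2, 3, 4, 5}, 2: {2, 3, 4, 5}, 3: {3, 5}, 4: {4, 5}, 5: {5}}
ELEMS = sorted(UP)
F = {1: 1, 2: 1, 3: 5, 4: 5, 5: 5}  # the function f of the example


def leq(a, b):
    return b in UP[a]


def meet(a, b):
    """Greatest lower bound of a and b."""
    lower = [c for c in ELEMS if leq(c, a) and leq(c, b)]
    return next(c for c in lower if all(leq(d, c) for d in lower))


def meet_closure(xs):
    """Smallest meet-closed set containing xs and the top element 5."""
    closed = set(xs) | {5}
    while True:
        new = {meet(a, b) for a in closed for b in closed} - closed
        if not new:
            return closed
        closed |= new


def uco(image):
    """Closure operator with the given meet-closed image."""
    def apply(x):
        result = 5
        for y in image:
            if leq(x, y):
                result = meet(result, y)
        return result
    return apply


def bca(image):
    """The map mu o f o mu on C, where mu is the uco with the given image."""
    mu = uco(image)
    return {x: mu(F[mu(x)]) for x in ELEMS}
```

Under these assumptions, `bca({1, 5})`, `bca({1, 3, 5})` and `bca({1, 4, 5})` coincide, whereas `bca(meet_closure({1, 3, 5} | {1, 4, 5}))` is $f$ itself, which differs from them at the point 2.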

2.2 The Solution

Our key technical result is the following constructive characterization of the property of "having the same b.c.a." for two comparable abstract domains. In the following, given a poset $A$ and any subset $S \subseteq A$, $\max(S) \triangleq \{x \in S \mid \forall y \in S.\ x \leq_A y \Rightarrow x = y\}$ denotes the set of maximal elements of $S$ in $A$.

Lemma 2.3. Let $f : C \to C$ and $A, B \in \mathrm{Abs}(C)$ such that $B \subseteq A$. Suppose that $f \circ \mu_A : C \to C$ is continuous (i.e., preserves lub's of chains in $C$). Then,

$$f^B = f^A \iff \mathrm{img}(f^A) \cup \bigcup_{y \in A} \max(\{x \in A \mid f^A(x) \leq_A y\}) \subseteq B.$$

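Proof. Let $\mu$ and $\rho$ be the ucos induced by, respectively, the abstractions $A$ and $B$, so that $\mu \sqsubseteq \rho$. Then, observe that $\mathrm{img}(f^A) = \mu(f(\mu(C)))$ and $\{x \in A \mid f^A(x) \leq_A y\} = (\mu \circ f \circ \mu)^{-1}(\downarrow y)$. We therefore prove the following equivalent statement which is formalized through ucos:

$$\rho \circ f \circ \rho = \mu \circ f \circ \mu \quad\text{iff}\quad \mu(f(\mu(C))) \cup \bigcup_{y \in \mu} \max((\mu \circ f \circ \mu)^{-1}(\downarrow y)) \subseteq \rho.$$

Let us first prove that

$$\rho \circ f \circ \rho = \mu \circ f \circ \mu \iff \rho \circ f \circ \mu = \mu \circ f \circ \mu = \mu \circ f \circ \rho \qquad (*)$$

($\Rightarrow$) On the one hand,

$\mu \circ f \circ \mu = \rho \circ f \circ \rho \Rightarrow$   [by applying $\rho$ in front to both sides]
$\rho \circ \mu \circ f \circ \mu = \rho \circ \rho \circ f \circ \rho \Rightarrow$   [because $\rho \circ \mu = \rho$ and $\rho \circ \rho = \rho$]
$\rho \circ f \circ \mu = \rho \circ f \circ \rho \Rightarrow$   [by the hypothesis $\rho \circ f \circ \rho = \mu \circ f \circ \mu$]
$\rho \circ f \circ \mu = \mu \circ f \circ \mu$

and on the other hand,

$\mu \circ f \circ \mu = \rho \circ f \circ \rho \Rightarrow$   [by composing both sides with $\rho$ on the right]
$\mu \circ f \circ \mu \circ \rho = \rho \circ f \circ \rho \circ \rho \Rightarrow$   [because $\mu \circ \rho = \rho$ and $\rho \circ \rho = \rho$]
$\mu \circ f \circ \rho = \rho \circ f \circ \rho$

so that $\rho \circ f \circ \mu = \mu \circ f \circ \mu = \mu \circ f \circ \rho$.

($\Leftarrow$) We have that:

$\mu \circ f \circ \mu = \rho \circ f \circ \mu = \mu \circ f \circ \rho \Rightarrow$   [by applying $\rho$ in front to both sides of the second equality]
$\rho \circ \rho \circ f \circ \mu = \rho \circ \mu \circ f \circ \rho \Rightarrow$   [since $\rho \circ \rho = \rho$ and $\rho \circ \mu = \rho$]
$\rho \circ f \circ \mu = \rho \circ f \circ \rho \Rightarrow$   [since $\rho \circ f \circ \mu = \mu \circ f \circ \mu$]
$\mu \circ f \circ \mu = \rho \circ f \circ \rho$.

Let us now consider the equation $\rho \circ f \circ \mu = \mu \circ f \circ \mu$: since $\rho = \rho \circ \mu$, this is equivalent to $\rho \circ \mu \circ f \circ \mu = \mu \circ f \circ \mu$, which is obviously equivalent to $\mu(f(\mu(C))) \subseteq \rho$. Since $\rho = \mu \circ \rho$, we have that $\mu \circ f \circ \mu = \mu \circ f \circ \rho$ is equivalent to $\mu \circ (f \circ \mu) = \mu \circ (f \circ \mu) \circ \rho$. By the characterization of completeness in [18, Lemma 4.2], since, by hypothesis, $f \circ \mu$ is continuous, we have that the completeness equation $\mu \circ (f \circ \mu) = \mu \circ (f \circ \mu) \circ \rho$ is equivalent to $\bigcup_{y \in \mu} \max((f \circ \mu)^{-1}(\downarrow y)) \subseteq \rho$, which is in turn equivalent to $\bigcup_{y \in \mu} \max((\mu \circ f \circ \mu)^{-1}(\downarrow y)) \subseteq \rho$.

Summing up, we have thus shown that

$$\rho \circ f \circ \mu = \mu \circ f \circ \mu = \mu \circ f \circ \rho \iff \mu(f(\mu(C))) \cup \bigcup_{y \in \mu} \max((\mu \circ f \circ \mu)^{-1}(\downarrow y)) \subseteq \rho$$

and this, by the above property (*), implies the thesis. □

It is important to remark that the above proof basically consists in reducing the equality $f^A = f^B$ between b.c.a.'s to a standard property of completeness of the abstract domains $A$ and $B$ for the function $f$ and then in exploiting the constructive characterization of completeness of abstract domains by Giacobazzi et al. [18, Section 4]. In this sense, the proof itself is particularly interesting because it provides an unexpected reduction of best correct approximations to a completeness problem.

As a consequence of Lemma 2.3 we obtain the following constructive result of existence for correctness kernels. Recall that if $X \subseteq A$ then $\mathrm{Cl}_\wedge(X)$ denotes the glb-closure of $X$ in $A$, while $\mathrm{Cl}_\vee(X)$ denotes the dual lub-closure.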
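Theorem 2.4. Let $A \in \mathrm{Abs}(C)$ and $F \subseteq C \to C$ such that, for any $f \in F$, $f \circ \mu_A$ is continuous. Then, the correctness kernel of $A$ for $F$ exists and it is

$$\mathcal{K}_F(A) = \mathrm{Cl}_\wedge\Big(\bigcup_{f \in F} \Big(\mathrm{img}(f^A) \cup \bigcup_{y \in \mathrm{img}(f^A)} \max(\{x \in A \mid f^A(x) = y\})\Big)\Big).$$

Proof. Let $\mu = \mu_A$. We prove the following equivalent statement formalized through ucos: $\mathrm{Cl}_\wedge\big(\bigcup_{f \in F} \bigcup_{y \in \mu(f(\mu(C)))} (\{y\} \cup \max(\{x \in \mu \mid \mu(f(x)) = y\}))\big)$ is the correctness kernel of $\mu$ for $F$.

Let $\rho_s \triangleq \mathrm{Cl}_\wedge\big(\mu(f(\mu(C))) \cup \bigcup_{y \in \mu} \max((\mu \circ f \circ \mu)^{-1}(\downarrow y))\big)$. By Lemma 2.3 we have that $\sqcup\{\rho \in \mathrm{uco}(C) \mid \rho \sqsupseteq \mu,\ \rho \circ f \circ \rho = \mu \circ f \circ \mu\} = \rho_s$. Since $\sqcup\{\rho \in \mathrm{uco}(C) \mid \rho \circ f \circ \rho = \mu \circ f \circ \mu\} = \sqcup\{\rho \in \mathrm{uco}(C) \mid \rho \sqsupseteq \mu,\ \rho \circ f \circ \rho = \mu \circ f \circ \mu\}$, as a consequence we also have that $\rho_s$ is the correctness kernel of $\mu$ for $F$.

Therefore, let us prove that

$$\mathrm{Cl}_\wedge\Big(\mu(f(\mu(C))) \cup \bigcup_{y \in \mu} \max((\mu \circ f \circ \mu)^{-1}(\downarrow y))\Big) = \mathrm{Cl}_\wedge\Big(\bigcup_{y \in \mu(f(\mu(C)))} \big(\{y\} \cup \max(\{x \in \mu \mid \mu(f(x)) = y\})\big)\Big).$$

Let us first observe that for any $y \in \mu$, if $z \in \max((\mu \circ f \circ \mu)^{-1}(\downarrow y))$ then $z \in \mu$: in fact, $\mu(f(\mu(\mu(z)))) = \mu(f(\mu(z))) \leq y$, so that from $z \leq \mu(z)$, by maximality of $z$, we get $z = \mu(z)$.

($\subseteq$): Consider $y \in \mu$ and $z \in \max((\mu \circ f \circ \mu)^{-1}(\downarrow y))$. Then, it turns out that $z \in \max(\{x \in \mu \mid \mu(f(x)) = \mu(f(\mu(z)))\})$. In fact, since $z = \mu(z)$, we have that $\mu(f(z)) = \mu(f(\mu(z)))$. Moreover, if $u \in \{x \in \mu \mid \mu(f(x)) = \mu(f(\mu(z)))\}$ and $z \leq u$ then $\mu(f(\mu(u))) = \mu(f(u)) = \mu(f(\mu(z))) \leq y$, so that, by maximality of $z$, $z = u$, i.e., $z \in \max(\{x \in \mu \mid \mu(f(x)) = \mu(f(\mu(z)))\})$.

($\supseteq$): Consider $y = \mu(f(\mu(w)))$ and $z \in \max(\{x \in \mu \mid \mu(f(x)) = y\})$. Then, $\mu(f(\mu(z))) = \mu(f(z)) = y$ so that $z \in (\mu \circ f \circ \mu)^{-1}(\downarrow y)$. If $u \in (\mu \circ f \circ \mu)^{-1}(\downarrow y)$ and $z \leq u$ then $\mu(f(\mu(z))) \leq \mu(f(\mu(u))) \leq y = \mu(f(\mu(w))) = \mu(f(\mu(z)))$. Hence, since $z \leq u \leq \mu(u)$ and by maximality of $z$, we have that $z = \mu(u)$, and in turn $z = u$. Thus, $z \in \max((\mu \circ f \circ \mu)^{-1}(\downarrow y))$. □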

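Example 2.5. Consider sets of integers $\langle \wp(\mathbb{Z}), \subseteq \rangle$ as concrete domain and the square operation $sq : \wp(\mathbb{Z}) \to \wp(\mathbb{Z})$ as concrete function, i.e., $sq(X) \triangleq \{x^2 \mid x \in X\}$, which is obviously additive and therefore continuous. Consider the abstract domain $\mathrm{Sign} \in \mathrm{Abs}(\wp(\mathbb{Z})_\subseteq)$ that represents the sign of an integer variable.

[Figure: Hasse diagram of Sign, ordered by inclusion. Top: $\mathbb{Z}$; second level: $\mathbb{Z}_{\leq 0}$, $\mathbb{Z}_{\neq 0}$, $\mathbb{Z}_{\geq 0}$; third level: $\mathbb{Z}_{<0}$, $0$, $\mathbb{Z}_{>0}$; bottom: $\varnothing$.]

Sign induces the following best correct approximation of $sq$:

$$sq^{\mathrm{Sign}} = \{\varnothing \mapsto \varnothing,\ \mathbb{Z}_{<0} \mapsto \mathbb{Z}_{>0},\ 0 \mapsto 0,\ \mathbb{Z}_{>0} \mapsto \mathbb{Z}_{>0},\ \mathbb{Z}_{\leq 0} \mapsto \mathbb{Z}_{\geq 0},\ \mathbb{Z}_{\neq 0} \mapsto \mathbb{Z}_{>0},\ \mathbb{Z}_{\geq 0} \mapsto \mathbb{Z}_{\geq 0},\ \mathbb{Z} \mapsto \mathbb{Z}_{\geq 0}\}.$$

Let us characterize the correctness kernel $\mathcal{K}_{sq}(\mathrm{Sign})$ by Theorem 2.4. We have that $\mathrm{img}(sq^{\mathrm{Sign}}) = \{\varnothing, \mathbb{Z}_{>0}, 0, \mathbb{Z}_{\geq 0}\}$. Moreover,

$\max(\{x \in \mathrm{Sign} \mid sq^{\mathrm{Sign}}(x) = \varnothing\}) = \{\varnothing\}$
$\max(\{x \in \mathrm{Sign} \mid sq^{\mathrm{Sign}}(x) = \mathbb{Z}_{>0}\}) = \{\mathbb{Z}_{\neq 0}\}$
$\max(\{x \in \mathrm{Sign} \mid sq^{\mathrm{Sign}}(x) = 0\}) = \{0\}$
$\max(\{x \in \mathrm{Sign} \mid sq^{\mathrm{Sign}}(x) = \mathbb{Z}_{\geq 0}\}) = \{\mathbb{Z}\}$

Therefore, $\bigcup_{y \in \mathrm{img}(sq^{\mathrm{Sign}})} \max(\{x \in \mathrm{Sign} \mid sq^{\mathrm{Sign}}(x) = y\}) = \{\varnothing, \mathbb{Z}_{\neq 0}, 0, \mathbb{Z}\}$ so that, by Theorem 2.4,

$$\mathcal{K}_{sq}(\mathrm{Sign}) = \mathrm{Cl}_\cap(\{\varnothing, \mathbb{Z}_{>0}, 0, \mathbb{Z}_{\geq 0}, \mathbb{Z}_{\neq 0}, \mathbb{Z}\}) = \mathrm{Sign} \setminus \{\mathbb{Z}_{<0}, \mathbb{Z}_{\leq 0}\}.$$

Thus, it turns out that we can safely remove the abstract values $\mathbb{Z}_{<0}$ and $\mathbb{Z}_{\leq 0}$ from Sign and still preserve the same b.c.a. as Sign does. Besides, we cannot remove further abstract elements, otherwise we do not retain the same b.c.a. as Sign. For example, this means that Sign-based analyses of programs like

    x := k; while condition do x := x * x;

can be carried out by using the simpler domain $\mathrm{Sign} \setminus \{\mathbb{Z}_{<0}, \mathbb{Z}_{\leq 0}\}$, yet providing the same input/output abstract behavior. □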
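The construction of Theorem 2.4 used in Example 2.5 can be sketched on a finite encoding. The following Python fragment is only an illustration under stated assumptions: Sign is encoded as the powerset of three hypothetical block names (`"neg"` for $\mathbb{Z}_{<0}$, `"zero"` for $\{0\}$, `"pos"` for $\mathbb{Z}_{>0}$), so that for instance $\mathbb{Z}_{\neq 0}$ is `{"neg", "pos"}`, and `sq_sign` is the table of $sq^{\mathrm{Sign}}$ given above rewritten in this encoding. The function names are not taken from the paper.

```python
from itertools import combinations

# Assumed encoding: each Sign element is a set of sign blocks.
BLOCKS = ("neg", "zero", "pos")
SQ_IMAGE = {"neg": "pos", "zero": "zero", "pos": "pos"}  # sign of a square


def powerset(xs):
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


SIGN = powerset(BLOCKS)
TOP = frozenset(BLOCKS)


def sq_sign(x):
    """Best correct approximation of sq on Sign, in the block encoding."""
    return frozenset(SQ_IMAGE[b] for b in x)


def maximal(elems):
    """Maximal elements w.r.t. set inclusion."""
    return {x for x in elems if not any(x < y for y in elems)}


def meet_closure(elems):
    """Closure under intersection, including the top element."""
    closed = set(elems) | {TOP}
    while True:
        new = {a & b for a in closed for b in closed} - closed
        if not new:
            return closed
        closed |= new


def correctness_kernel(domain, bca):
    """Image of bca plus the maximal preimages of each image point, meet-closed."""
    image = {bca(x) for x in domain}
    generators = set(image)
    for y in image:
        generators |= maximal({x for x in domain if bca(x) == y})
    return meet_closure(generators)


KERNEL = correctness_kernel(SIGN, sq_sign)
```

In this encoding `KERNEL` contains six of the eight elements of `SIGN`; the two missing ones are `{"neg"}` and `{"neg", "zero"}`, matching $\mathbb{Z}_{<0}$ and $\mathbb{Z}_{\leq 0}$ in the example.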



It is worth remarking that in Theorem 2.4 the hypothesis of continuity is crucial for the existence of correctness kernels and this is shown by the following example.

Example 2.6. Let us consider as concrete domain $C$ the $\omega + 2$ ordinal, i.e., $C = \{x \in \mathrm{Ord} \mid x < \omega\} \cup \{\omega, \omega + 1\}$, and let $f : C \to C$ be defined as follows:

$$f(x) \triangleq \begin{cases} \omega & \text{if } x < \omega; \\ \omega + 1 & \text{otherwise.} \end{cases}$$

Let $\mu \in \mathrm{uco}(C)$ be the identity uco $\lambda x.x$, so that $\mu \circ f \circ \mu = f$. For any $k \geq 0$, consider $\rho_k \in \mathrm{uco}(C)$ defined as $\rho_k = C \setminus [0, k[$ and observe that for any $k$, we have that $\rho_k \circ f \circ \rho_k = f = \mu \circ f \circ \mu$. However, it turns out that $\sqcup_{k \geq 0} \rho_k = \bigcap_{k \geq 0} \mathrm{img}(\rho_k) = \{\omega, \omega + 1\}$. Hence, $(\sqcup_{k \geq 0} \rho_k) \circ f \circ (\sqcup_{k \geq 0} \rho_k) = \lambda x.\omega + 1 \neq \mu \circ f \circ \mu$. Hence, the correctness kernel of $\mu$ for $f$ does not exist. Observe that $\mu \circ f = f$ is clearly not continuous and therefore this example is consistent with Theorem 2.4. □
3 Correctness Kernels in Abstract Model Checking



Following the approach by Ranzato and Tapparo [20], partitions of a state space $\Sigma$ can be viewed as particular abstract domains of the concrete domain $\wp(\Sigma)$. Let $\mathrm{Part}(\Sigma)$ denote the set of partitions of $\Sigma$. Given a partition $P \in \mathrm{Part}(\Sigma)$, the corresponding set of (possibly empty) unions of blocks of $P$, namely $\wp(P)$, is viewed as an abstract domain of $\wp(\Sigma)$ by means of the following Galois insertion $(\alpha_P, \wp(\Sigma)_\subseteq, \wp(P)_\subseteq, \gamma_P)$:

$$\alpha_P(S) \triangleq \{B \in P \mid B \cap S \neq \varnothing\} \quad\text{and}\quad \gamma_P(\mathcal{B}) \triangleq \bigcup_{B \in \mathcal{B}} B.$$

Hence, the abstraction $\alpha_P(S)$ provides the minimal over-approximation of a set $S$ of states through blocks of $P$.

Consider a transition system $\mathcal{S} = \langle \Sigma, \to \rangle$ and a corresponding abstract transition system $\mathcal{A} = \langle P, \to^\sharp \rangle$ defined over a state partition $P \in \mathrm{Part}(\Sigma)$.¹ Fixpoint-based verification of a temporal specification on the abstract model $\mathcal{A}$ relies on the computation of least/greatest fixpoints of operators which are defined using Boolean connectives (union, intersection, complementation) on abstract states and abstract successor/predecessor functions $\mathrm{post}^\sharp$/$\mathrm{pre}^\sharp$ on the abstract transition system $\langle P, \to^\sharp \rangle$. The key point here is that successor/predecessor functions are defined as best correct approximations on the abstract domain $P$ of the corresponding concrete successor/predecessor functions. In standard abstract model checking [1, 6, 7], the abstract transition relation is defined as the existential/existential relation between blocks of $P$, namely:

$$B \to^{\exists\exists} C \quad\text{iff}\quad \exists x \in B.\ \exists y \in C.\ x \to y$$
$$\mathrm{post}^{\exists\exists}(B) \triangleq \{C \in P \mid B \to^{\exists\exists} C\}; \qquad \mathrm{pre}^{\exists\exists}(C) \triangleq \{B \in P \mid B \to^{\exists\exists} C\}.$$

As shown in [20], it turns out that $\mathrm{pre}^{\exists\exists}$ and $\mathrm{post}^{\exists\exists}$ are the best correct approximations of, respectively, the pre and post functions on the above abstraction $(\alpha_P, \wp(\Sigma)_\subseteq, \wp(P)_\subseteq, \gamma_P)$. In fact, for a block $C \in P$, we have that

$$\alpha_P(\mathrm{pre}(\gamma_P(C))) = \{B \in P \mid B \cap \mathrm{pre}(C) \neq \varnothing\} = \mathrm{pre}^{\exists\exists}(C)$$

and an analogous equation holds for post. We thus have that $\mathrm{pre}^{\exists\exists} = \alpha_P \circ \mathrm{pre} \circ \gamma_P$ and $\mathrm{post}^{\exists\exists} = \alpha_P \circ \mathrm{post} \circ \gamma_P$.
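The identities $\mathrm{pre}^{\exists\exists} = \alpha_P \circ \mathrm{pre} \circ \gamma_P$ and $\mathrm{post}^{\exists\exists} = \alpha_P \circ \mathrm{post} \circ \gamma_P$ can be checked exhaustively on a small system. The sketch below is a hypothetical illustration: the states, the transition relation `TRANS` and the partition are invented for the purpose of the check and do not correspond to any figure of the paper.

```python
from itertools import combinations

# Hypothetical concrete system, chosen only for illustration.
TRANS = {(1, 2), (1, 3), (2, 4), (3, 5), (4, 6)}
PARTITION = [frozenset({1}), frozenset({2, 3}), frozenset({4, 5}), frozenset({6})]


def pre(states):
    return {s for (s, t) in TRANS if t in states}


def post(states):
    return {t for (s, t) in TRANS if s in states}


def alpha(states):
    """Blocks of the partition that intersect the given set of states."""
    return {b for b in PARTITION if b & states}


def gamma(blocks):
    """Union of the given blocks."""
    return set().union(*blocks)


def related(b, c):
    """Existential/existential abstract transition between two blocks."""
    return any((x, y) in TRANS for x in b for y in c)


def pre_ee(blocks):
    return {b for b in PARTITION if any(related(b, c) for c in blocks)}


def post_ee(blocks):
    return {c for c in PARTITION if any(related(b, c) for b in blocks)}


def all_block_sets():
    """Every element of the abstract domain, i.e. every set of blocks."""
    return [set(c) for r in range(len(PARTITION) + 1)
            for c in combinations(PARTITION, r)]


BCA_HOLDS = all(
    pre_ee(bs) == alpha(pre(gamma(bs))) and post_ee(bs) == alpha(post(gamma(bs)))
    for bs in all_block_sets()
)
```

Here `BCA_HOLDS` compares both sides on all sixteen sets of blocks; this is a sanity check on one example, not a proof of the general statement from [20].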

This abstract interpretation-based framework allows us to apply correctness kernels in the context of abstract model checking. The abstract transition system $\mathcal{A} = \langle P, \to^{\exists\exists} \rangle$ is viewed as an abstract interpretation defined by the abstract domain $(\alpha_P, \wp(\Sigma)_\subseteq, \wp(P)_\subseteq, \gamma_P)$ and the corresponding abstract functions $\mathrm{pre}^{\exists\exists} = \alpha_P \circ \mathrm{pre} \circ \gamma_P$ and $\mathrm{post}^{\exists\exists} = \alpha_P \circ \mathrm{post} \circ \gamma_P$. Then, the correctness kernel of the abstraction $\wp(P)$ for the concrete predecessor/successor functions $\{\mathrm{pre}, \mathrm{post}\}$, that we denote simply by $\mathcal{K}_{\to}(P)$ (by Theorem 2.4 this clearly exists since pre and post are additive functions), provides a simplification of the abstract domain $\wp(P)$ that preserves the best correct approximations of predecessor and successor functions. This simplification $\mathcal{K}_{\to}(P)$ of the abstract state space $P$ works as follows:

¹ Equivalently, the abstract transition system $\mathcal{A}$ can be defined over an abstract state space $A$ determined by a surjective abstraction function $h : \Sigma \to A$.
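Corollary 3.1. $\mathcal{K}_{\to}(P)$ merges two blocks $B_1, B_2 \in P$ if and only if for any $A \in P$,

$$A \to^{\exists\exists} B_1 \Leftrightarrow A \to^{\exists\exists} B_2 \quad\text{and}\quad B_1 \to^{\exists\exists} A \Leftrightarrow B_2 \to^{\exists\exists} A.$$

Proof. By Theorem 2.4 we have that the kernel of the abstraction $\wp(P) \in \mathrm{Abs}(\wp(\Sigma))$ for pre and post is as follows:

$$\mathcal{K}_{\to}(P) = \mathrm{Cl}_\cap\Big(\mathrm{img}(\mathrm{pre}^{\exists\exists}) \cup \big\{\textstyle\bigcup\{\mathcal{C} \in \wp(P) \mid \mathrm{pre}^{\exists\exists}(\mathcal{C}) = \mathcal{B}\} \;\big|\; \mathcal{B} \in \mathrm{img}(\mathrm{pre}^{\exists\exists})\big\}$$
$$\qquad\qquad \cup\ \mathrm{img}(\mathrm{post}^{\exists\exists}) \cup \big\{\textstyle\bigcup\{\mathcal{C} \in \wp(P) \mid \mathrm{post}^{\exists\exists}(\mathcal{C}) = \mathcal{B}\} \;\big|\; \mathcal{B} \in \mathrm{img}(\mathrm{post}^{\exists\exists})\big\}\Big).$$

Let us observe that both b.c.a.'s $\mathrm{pre}^{\exists\exists}, \mathrm{post}^{\exists\exists} : \wp(P) \to \wp(P)$ are additive functions, so that for any $\mathcal{B} \in \mathrm{img}(\mathrm{pre}^{\exists\exists})$, $\bigcup\{\mathcal{C} \in \wp(P) \mid \mathrm{pre}^{\exists\exists}(\mathcal{C}) = \mathcal{B}\} \in \mathrm{img}(\mathrm{pre}^{\exists\exists})$ and for any $\mathcal{B} \in \mathrm{img}(\mathrm{post}^{\exists\exists})$, $\bigcup\{\mathcal{C} \in \wp(P) \mid \mathrm{post}^{\exists\exists}(\mathcal{C}) = \mathcal{B}\} \in \mathrm{img}(\mathrm{post}^{\exists\exists})$. Moreover, $\mathcal{K}_{\to}(P)$ is closed under arbitrary unions. Hence, the kernel can be simplified as follows:

$$\mathcal{K}_{\to}(P) = \mathrm{Cl}_{\cap,\cup}\big(\{\mathrm{pre}^{\exists\exists}(\{C\}) \mid C \in P\} \cup \{\mathrm{post}^{\exists\exists}(\{B\}) \mid B \in P\}\big).$$

We therefore have that a block $B \in P$ is merged together with all the blocks $B' \in P$ such that for any block $A \in P$, $B \in \mathrm{pre}^{\exists\exists}(\{A\}) \Leftrightarrow B' \in \mathrm{pre}^{\exists\exists}(\{A\})$ and $B \in \mathrm{post}^{\exists\exists}(\{A\}) \Leftrightarrow B' \in \mathrm{post}^{\exists\exists}(\{A\})$. Thus, the thesis follows. □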

Example 3.2. Reconsider the abstract transition system $\mathcal{A}$ in Section 1 and let $P = \{[1], [2,3], [4,5], [6], [7], [8,9]\}$ be the underlying state partition. In this case, we have that

$$\mathrm{img}(\mathrm{pre}^{\exists\exists}) = \mathrm{Cl}_\cup(\{\{[1]\}, \{[2,3],[4,5]\}, \{[6],[7]\}\}),$$
$$\mathrm{img}(\mathrm{post}^{\exists\exists}) = \mathrm{Cl}_\cup(\{\{[2,3],[4,5]\}, \{[6],[7]\}, \{[8,9]\}\}).$$

Hence, by Corollary 3.1, in the correctness kernel $\mathcal{K}_{\to}(P)$ the block $[2,3]$ is merged with $[4,5]$ while $[6]$ is merged with $[7]$. This therefore simplifies the partition $P$ to $P'' = \{[1], [2,3,4,5], [6,7], [8,9]\}$, that is, we obtain the abstract transition system $\mathcal{A}''$ in Section 1. □
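The merging criterion of Corollary 3.1 amounts to grouping blocks that have the same abstract predecessors and the same abstract successors. The following Python sketch applies it to Example 3.2; the abstract edge set `EDGES` is an assumption reconstructed from the images of $\mathrm{pre}^{\exists\exists}$ and $\mathrm{post}^{\exists\exists}$ listed above (the figure itself is not reproduced here), and blocks are named by strings such as `"23"` for $[2,3]$.

```python
# Abstract existential/existential edges, reconstructed from Example 3.2 (assumption).
BLOCKS = ["1", "23", "45", "6", "7", "89"]
EDGES = {
    ("1", "23"), ("1", "45"),
    ("23", "6"), ("23", "7"),
    ("45", "6"), ("45", "7"),
    ("6", "89"), ("7", "89"),
}


def preds(block):
    """Abstract predecessors of a block."""
    return frozenset(a for (a, c) in EDGES if c == block)


def succs(block):
    """Abstract successors of a block."""
    return frozenset(c for (a, c) in EDGES if a == block)


def kernel_partition(blocks):
    """Group blocks having identical predecessor and successor sets."""
    groups = {}
    for b in blocks:
        groups.setdefault((preds(b), succs(b)), []).append(b)
    return sorted(groups.values())


MERGED = kernel_partition(BLOCKS)
```

With these edges, `MERGED` groups `"23"` with `"45"` and `"6"` with `"7"`, which corresponds to the partition $P''$ of the example.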

4 Example Guided Abstraction Simplification 

Let us discuss how correctness kernels give rise to an Example-Guided Abstraction 
Simplification (EGAS) paradigm in abstract transition systems. 

Let us first recall some basic notions of CEGAR [4, 5]. Consider an abstract transition system $\mathcal{A} = \langle P, \to^{\exists\exists} \rangle$ defined over a state partition $P \in \mathrm{Part}(\Sigma)$ and a finite abstract path $\pi = \langle B_1, \ldots, B_n \rangle$ in $\mathcal{A}$, where each $B_i$ is a block of $P$. Typically, this is a path counterexample to the validity of a temporal formula that has been given as output by a model checker (for simplicity we do not consider here loop path counterexamples). The set of concrete paths that are abstracted to $\pi$ is defined as follows:

$$\mathrm{paths}(\pi) \triangleq \{\langle s_1, \ldots, s_n \rangle \in \Sigma^n \mid \forall i \in [1, n].\ s_i \in B_i \ \&\ \forall i \in [1, n).\ s_i \to s_{i+1}\}.$$

The abstract path $\pi$ is spurious when it represents no real concrete path, i.e., when $\mathrm{paths}(\pi) = \varnothing$. The sequence of sets of states $\mathrm{sp}(\pi) = \langle S_1, \ldots, S_n \rangle$ is inductively defined as follows: $S_1 \triangleq B_1$; $S_{i+1} \triangleq \mathrm{post}(S_i) \cap B_{i+1}$. As shown in [5], it turns out that $\pi$ is spurious iff there exists a least $k \in [1, n-1]$ such that $S_{k+1} = \varnothing$. In such a case, the partition $P$ is refined by splitting the block $B_k$. The three following sets partition the states of the block $B_k$:

dead-end states: $B_k^{\mathrm{dead}} \triangleq S_k \neq \varnothing$
bad states: $B_k^{\mathrm{bad}} \triangleq B_k \cap \mathrm{pre}(B_{k+1})$
irrelevant states: $B_k^{\mathrm{irr}} \triangleq B_k \setminus (B_k^{\mathrm{dead}} \cup B_k^{\mathrm{bad}})$

The split of the block $B_k$ must separate dead-end states from bad states, while irrelevant states may be joined indifferently with dead-end or bad states. However, the problem of finding the coarsest refinement of $P$ that separates dead-end and bad states is NP-hard [5] and thus some refinement heuristics are used. According to the basic heuristics of CEGAR [5, Section 4], $B_k$ is simply split into $B_k^{\mathrm{dead}}$ and $B_k^{\mathrm{bad}} \cup B_k^{\mathrm{irr}}$.

Fig. 2. Some abstract transition systems.

Let us see a simple example. Consider the abstract path $\pi = \langle [1], [345], [6] \rangle$ in the abstract transition system $\mathcal{A}$ depicted in Figure 2. This is a spurious path and the block $[345]$ is therefore partitioned as follows: $[5]$ dead-end states, $[3]$ bad states and $[4]$ irrelevant states. The refinement heuristics of CEGAR tells us that irrelevant states are joined with bad states so that $\mathcal{A}$ is refined to the abstract transition system $\mathcal{A}'$. In turn, consider the spurious path $\pi' = \langle [2], [34], [6] \rangle$ in $\mathcal{A}'$, so that CEGAR refines $\mathcal{A}'$ to $\mathcal{A}'''$ by splitting the block $[34]$. In the first abstraction refinement, let us observe that if irrelevant states had been joined together with dead-end states rather than with bad states, we would have obtained the abstract system $\mathcal{A}''$, and $\mathcal{A}''$ does not contain spurious paths, so that it surely does not need to be further refined.
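The computation of $\mathrm{sp}(\pi)$ and of the dead-end, bad and irrelevant states can be sketched as follows. Since Figure 2 is not reproduced here, the concrete transition relation `TRANS` is an assumption, chosen so as to be consistent with the facts stated in the text (for instance, that $[5]$, $[3]$ and $[4]$ are, respectively, the dead-end, bad and irrelevant states of $[345]$); the function names are illustrative.

```python
# Hypothetical concrete transitions, consistent with the example in the text.
TRANS = {(1, 5), (2, 4), (2, 5), (3, 6), (4, 7), (5, 7)}


def post(states):
    return {t for (s, t) in TRANS if s in states}


def pre(states):
    return {s for (s, t) in TRANS if t in states}


def sp(path):
    """Sequence S_1, ..., S_n with S_1 = B_1 and S_{i+1} = post(S_i) & B_{i+1}."""
    seq = [set(path[0])]
    for block in path[1:]:
        seq.append(post(seq[-1]) & set(block))
    return seq


def split_failure_block(path):
    """Return (k, dead, bad, irrelevant) for a spurious path, or None otherwise.

    k is the 0-based index of the block to split.
    """
    seq = sp(path)
    k = next((i for i in range(len(path) - 1) if not seq[i + 1]), None)
    if k is None:
        return None
    block = set(path[k])
    dead = seq[k]
    bad = block & pre(set(path[k + 1]))
    return k, dead, bad, block - dead - bad
```

For the path $\langle [1], [345], [6] \rangle$, `split_failure_block([{1}, {3, 4, 5}, {6}])` yields the block at index 1 with dead-end states `{5}`, bad states `{3}` and irrelevant states `{4}`, in agreement with the example.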

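EGAS can be integrated within the CEGAR loop thanks to the following remark. If $\pi_1$ and $\pi_2$ are paths, respectively, in $\langle P_1, \to^{\exists\exists} \rangle$ and $\langle P_2, \to^{\exists\exists} \rangle$, where $P_1, P_2 \in \mathrm{Part}(\Sigma)$ and $P_1$ is finer than $P_2$, i.e. $P_1 \preceq P_2$, then we say that $\pi_1$ is abstracted to $\pi_2$, denoted by $\pi_1 \sqsubseteq \pi_2$, when $\mathrm{length}(\pi_1) = \mathrm{length}(\pi_2)$ and for any $j \in [1, \mathrm{length}(\pi_1)]$, $\pi_1(j) \subseteq \pi_2(j)$.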
Corollary 4.1. Consider an abstract transition system $\mathcal{A} = \langle P, \to^{\exists\exists} \rangle$ over a partition $P \in \mathrm{Part}(\Sigma)$ and its simplification $\mathcal{A}_s = \langle \mathcal{K}_{\to}(P), \to^{\exists\exists} \rangle$ induced by the correctness kernel $\mathcal{K}_{\to}(P)$. If $\pi$ is a spurious abstract path in $\mathcal{A}_s$ then there exists a spurious abstract path $\pi'$ in $\mathcal{A}$ such that $\pi' \sqsubseteq \pi$.

Proof. This is a simple consequence of Corollary 3.1. Let $\pi = \langle B_1, \ldots, B_n \rangle$, where $B_i \in \mathcal{K}_{\to}(P)$, and let $B_k$ be the block of $\pi$ that causes the spuriousness of $\pi$. Then, for each $i \in [1, n]$, we have that $B_i = \bigcup_{j_i} C_i^{j_i}$, where $C_i^{j_i} \in P$. By Corollary 3.1, for each $i \in [1, n]$ and $j_i$, $\mathrm{post}^{\exists\exists}(B_i) = \mathrm{post}^{\exists\exists}(C_i^{j_i})$ and for each $i$ and $j_i$, $\mathrm{pre}^{\exists\exists}(B_i) = \mathrm{pre}^{\exists\exists}(C_i^{j_i})$. Then, in order to define the path $\pi'$ in $\mathcal{A}$, for $i \in [1, n]$, one can choose any block $C_i^{j_i}$ in $P$ such that $C_i^{j_i} \subseteq B_i$. The key point to note here is that the definition of the correctness kernel $\mathcal{K}_{\to}(P)$ guarantees that $C_k^{j_k}$ causes the spuriousness of $\pi'$ and that $\pi' \sqsubseteq \pi$. □

Thus, it turns out that the abstraction simplification induced by the correctness kernel does not add spurious paths. These observations suggest a new refinement strategy within the CEGAR loop. Let $\pi = \langle B_1, \ldots, B_n \rangle$ be a spurious path in $\mathcal{A}$ and $\mathrm{sp}(\pi) = \langle S_1, \ldots, S_n \rangle$ such that $S_{k+1} = \varnothing$ for some minimum $k \in [1, n-1]$, so that the block $B_k$ needs to be split. The set of irrelevant states $B_k^{\mathrm{irr}}$ is partitioned as follows. We first define the subset of bad-irrelevant states $B_k^{\mathrm{bad\text{-}irr}}$. Let $\mathrm{pre}^{\exists\exists}(B_k^{\mathrm{bad}}) = \{A_1, \ldots, A_j\}$ and $\mathrm{post}^{\exists\exists}(B_k^{\mathrm{bad}}) = \{C_1, \ldots, C_l\}$. Then, we define:

$$B_k^{\mathrm{bad\text{-}irr}} \triangleq \big(\mathrm{post}(A_1 \cup \ldots \cup A_j) \cap \mathrm{pre}(C_1 \cup \ldots \cup C_l)\big) \cap B_k^{\mathrm{irr}}.$$

The underlying idea is simple: $B_k^{\mathrm{bad\text{-}irr}}$ contains the irrelevant states that: (1) can be reached from a block that reaches some bad state and (2) reach a block that is also reached by some bad state. By Corollary 4.1, it is therefore clear that by merging $B_k^{\mathrm{bad\text{-}irr}}$ and $B_k^{\mathrm{bad}}$ no spurious path is added w.r.t. the abstract system where they are kept separate. The subset of dead-irrelevant states $B_k^{\mathrm{dead\text{-}irr}}$ is analogously defined: if $\mathrm{pre}^{\exists\exists}(B_k^{\mathrm{dead}}) = \{A_1, \ldots, A_j\}$ and $\mathrm{post}^{\exists\exists}(B_k^{\mathrm{dead}}) = \{C_1, \ldots, C_l\}$ then

$$B_k^{\mathrm{dead\text{-}irr}} \triangleq \big(\mathrm{post}(A_1 \cup \ldots \cup A_j) \cap \mathrm{pre}(C_1 \cup \ldots \cup C_l)\big) \cap B_k^{\mathrm{irr}}.$$

It may happen that: (A) an irrelevant state is both bad- and dead-irrelevant; (B) an irrelevant state is neither bad- nor dead-irrelevant. From the viewpoint of EGAS, the states of case (A) can be equivalently merged with bad or dead states since in both cases no spurious path is added. On the other hand, the states of case (B) are called fully-irrelevant because EGAS does not provide a merging strategy with bad or dead states. For these states, one could use, for example, the basic refinement heuristics of CEGAR that merge them with bad states.

In the above example, for the spurious path $\langle [1], [3,4,5], [6] \rangle$ in $\mathcal{A}$, the block $B = [3,4,5]$ needs to be refined:

$$B^{\mathrm{bad}} = [3], \quad B^{\mathrm{dead}} = [5], \quad B^{\mathrm{irr}} = [4].$$

Here, 4 is a dead-irrelevant state because $\mathrm{pre}^{\exists\exists}([5]) = \{[1], [2]\}$, $\mathrm{post}^{\exists\exists}([5]) = \{[7]\}$ and $(\mathrm{post}([1] \cup [2]) \cap \mathrm{pre}([7])) \cap [4] = \{4\}$. Hence, according to the EGAS refinement strategy, the dead-irrelevant state 4 is merged in $\mathcal{A}''$ with the dead-end state 5.
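The classification of irrelevant states into bad-irrelevant and dead-irrelevant ones can be sketched in Python as follows. As before, the transition relation `TRANS` is an assumption chosen to agree with the facts stated in the text, `OTHER_BLOCKS` lists the blocks of the partition other than the one being split, and the helper name `guided_irrelevant` is hypothetical.

```python
# Hypothetical concrete transitions, consistent with the example in the text.
TRANS = {(1, 5), (2, 4), (2, 5), (3, 6), (4, 7), (5, 7)}
OTHER_BLOCKS = [{1}, {2}, {6}, {7}]  # blocks different from the split block [3,4,5]


def post(states):
    return {t for (s, t) in TRANS if s in states}


def pre(states):
    return {s for (s, t) in TRANS if t in states}


def guided_irrelevant(core, irrelevant, blocks):
    """Irrelevant states reachable from a block that reaches `core`
    and reaching a block that is also reached from `core`."""
    before = set().union(*[a for a in blocks if post(a) & core])
    after = set().union(*[c for c in blocks if post(core) & c])
    return post(before) & pre(after) & irrelevant


BAD, DEAD, IRR = {3}, {5}, {4}
BAD_IRR = guided_irrelevant(BAD, IRR, OTHER_BLOCKS)
DEAD_IRR = guided_irrelevant(DEAD, IRR, OTHER_BLOCKS)
```

Under these assumptions `DEAD_IRR` is `{4}` and `BAD_IRR` is empty, so state 4 would be merged with the dead-end state 5, as in the example.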
    int x, y, z, w;

    void foo() {
      do {
        z := 0; x := y;
        if (w) { x++; z := 1; }
      } while (!(x = y))
      if (z)
        assert(0); // (*)
    }

Fig. 3. An example program.



5 Correctness Kernels in Predicate Abstraction 

Let us discuss how correctness kernels can be also used in the context of predicate 
abstraction-based model checking 111 11191 . Following Ball et al.'s approach |2|, predi- 
cate abstraction can be formalized by abstract interpretation as follows. Let us consider 
a program P with k integer variables xi,...,Xk- The concrete domain of computation of 
P is (p(Statcs), C) where States = {xi, ...yXk} Values in States are denoted 

by tuples (zi, Zk) G Z*^. The program P generates a transition system (States, ->) 
so that the concrete semantics of P is defined by the corresponding successor function 
post : p(States) — > p(States). 

A finite set $\mathcal{P} = \{p_1, \ldots, p_n\}$ of state predicates is considered, where each
predicate $p_i$ denotes the subset of states that satisfy $p_i$, i.e. $\{s \in \mathrm{States} \mid s \models p_i\}$.
These predicates give rise to the so-called Boolean abstraction $B \triangleq \langle \wp(\{0,1\}^n), \subseteq \rangle$,
which is related to $\wp(\mathrm{States})$ through the following abstraction and concretization maps
(here, $s \models p_i$ is understood in $\{0,1\}$):

$$\alpha_B(S) \triangleq \{ (s \models p_1, \ldots, s \models p_n) \in \{0,1\}^n \mid s \in S \},$$

$$\gamma_B(V) \triangleq \{ s \in \mathrm{States} \mid (s \models p_1, \ldots, s \models p_n) \in V \}.$$

These functions give rise to a disjunctive (i.e., $\gamma_B$ preserves lub's) Galois connection
$(\alpha_B, \wp(\mathrm{States})_{\subseteq}, \wp(\{0,1\}^n)_{\subseteq}, \gamma_B)$.
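The two maps can be made concrete with a minimal Python sketch. It rests on assumptions that are not part of the definitions above: the infinite set States is replaced by a small finite window of triples (x, y, z), and the two predicates are chosen only for illustration. The names `STATES`, `PREDICATES`, `bits`, `alpha_B` and `gamma_B` are hypothetical.

```python
from itertools import product

# Assumption: a finite window of states (x, y, z) stands in for the infinite set States.
STATES = set(product(range(-2, 3), range(-2, 3), (0, 1)))

# Illustrative predicates (assumed here): p1 is z == 0, p2 is x == y.
PREDICATES = [lambda s: s[2] == 0, lambda s: s[0] == s[1]]

def bits(s):
    """Truth values of all predicates on state s, as a tuple over {0, 1}."""
    return tuple(int(p(s)) for p in PREDICATES)

def alpha_B(S):
    """Abstraction: the set of predicate valuations realised by the states in S."""
    return frozenset(bits(s) for s in S)

def gamma_B(V):
    """Concretization: every state of the window whose valuation lies in V."""
    return frozenset(s for s in STATES if bits(s) in V)

S = {(0, 0, 0), (1, 2, 1)}
V = alpha_B(S)
print(sorted(V))            # [(0, 0), (1, 1)]
print(S <= gamma_B(V))      # True: gamma_B(alpha_B(S)) over-approximates S
```

In this window every valuation is realised by some state, so `alpha_B(gamma_B(V))` returns `V` itself.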

Verification of reachability properties based on predicate abstraction consists in
computing the least fixpoint of the best correct approximation of post on the Boolean
abstraction $B$, namely, $\mathrm{post}^B \triangleq \alpha_B \circ \mathrm{post} \circ \gamma_B$. As argued in [2], the Boolean
abstraction $B$ may be too costly for the purpose of reachability verification, so that one
usually abstracts $B$ through the so-called Cartesian abstraction. This latter abstraction
formalizes precisely the abstract post operator computed by the verification algorithm
of the c2bp tool in SLAM [3]. However, the Cartesian abstraction of $B$ may cause a
loss of precision, so that this abstraction is successively refined by reduced disjunctive
completion and the so-called focus operation, and this formalizes the bebop tool in
SLAM [2].
Let us consider the example program in Figure 3, taken from [2], where the goal
is that of verifying that the assert at line (*) is never reached, regardless of the
context in which foo() is called. Ball et al. [2] consider the following set of predicates
$\mathcal{P} \triangleq \{p_1 \equiv (z = 0),\ p_2 \equiv (x = y)\}$ so that the Boolean abstraction is
$B = \wp(\{(0,0), (0,1), (1,0), (1,1)\})_{\subseteq}$. Clearly, the analysis based on $B$ allows one to
conclude that line (*) is not reachable. This comes as a consequence of the fact that the
least fixpoint computation of the best correct approximation $\mathrm{post}^B$ for the do-while loop
provides as result $\{(0,0), (1,1)\} \in B$ because:

$$\emptyset \xrightarrow{\,z := 0;\ x := y\,} \{(1,1)\} \xrightarrow{\,x{+}{+};\ z := 1\,} \{(1,1)\} \cup \{(0,0)\}$$

where, according to a standard approach, the guard of the if statement is simply ignored.
Hence, at the exit of the do-while loop one can conclude that

$$\{(1,1), (0,0)\} \cap p_2 = \{(1,1), (0,0)\} \cap \{(0,1), (1,1)\} = \{(1,1)\}$$

holds, hence $p_1$ is satisfied, so that $z = 0$ and therefore line (*) can never be reached.
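The fixpoint argument above can be replayed with a short Python sketch. It is an illustration under stated assumptions, not the algorithm of any tool: integer values are restricted to a bounded window, the guard `if (w)` is ignored as in the text, and the names `bca`, `loop_end`, `s1` and `s2` are hypothetical.

```python
from itertools import product

# Assumption: a bounded window of integers stands in for the infinite state space.
VALS = range(-3, 4)
STATES = [(x, y, z) for x, y, z in product(VALS, VALS, (0, 1))]

def bits(s):
    x, y, z = s
    return (int(z == 0), int(x == y))        # valuation of (p1, p2)

def s1(s):                                   # z := 0; x := y
    x, y, z = s
    return (y, y, 0)

def s2(s):                                   # x++; z := 1
    x, y, z = s
    return (x + 1, y, 1)

def bca(stmt, V):
    """alpha_B(post_stmt(gamma_B(V))), computed over the window."""
    return frozenset(bits(stmt(s)) for s in STATES if bits(s) in V)

def loop_end():
    """Least set of valuations that can hold at the end of the loop body."""
    top = frozenset(product((0, 1), repeat=2))              # arbitrary calling context
    inv = frozenset()
    while True:
        back_edge = frozenset(v for v in inv if v[1] == 0)  # loop repeats when x != y
        after_s1 = bca(s1, top | back_edge)
        new = inv | after_s1 | bca(s2, after_s1)            # guard `if (w)` ignored
        if new == inv:
            return inv
        inv = new

inv = loop_end()
exit_vals = frozenset(v for v in inv if v[1] == 1)          # loop exits when x == y
print(sorted(inv))        # [(0, 0), (1, 1)]
print(sorted(exit_vals))  # [(1, 1)]
```

Every valuation in `exit_vals` has first component 1, that is, `z = 0` holds at the loop exit, which matches the conclusion that line (*) is unreachable.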

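Let us characterize the correctness kernel of the Boolean abstraction $B$. Let
$S_1 \triangleq z := 0;\ x := y$ and $S_2 \triangleq x{+}{+};\ z := 1$. The best correct approximations of
$\mathrm{post}_{S_1}$ and $\mathrm{post}_{S_2}$ on the abstract domain $B$ turn out to be as follows:

$$\alpha_B \circ \mathrm{post}_{S_1} \circ \gamma_B = \{ (0,0) \mapsto \{(1,1)\},\ (0,1) \mapsto \{(1,1)\},\ (1,0) \mapsto \{(1,1)\},\ (1,1) \mapsto \{(1,1)\} \}$$

$$\alpha_B \circ \mathrm{post}_{S_2} \circ \gamma_B = \{ (0,0) \mapsto \{(0,0),(0,1)\},\ (0,1) \mapsto \{(0,0)\},\ (1,0) \mapsto \{(0,0),(0,1)\},\ (1,1) \mapsto \{(0,0)\} \}$$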

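Thus, we have that $\mathrm{img}(\alpha_B \circ \mathrm{post}_{S_1} \circ \gamma_B) = \{\{(1,1)\}\}$ and
$\mathrm{img}(\alpha_B \circ \mathrm{post}_{S_2} \circ \gamma_B) = \{\{(0,0),(0,1)\}, \{(0,0)\}\}$ so that

$$\max(\{V \in B \mid \alpha_B(\mathrm{post}_{S_1}(\gamma_B(V))) = \{(1,1)\}\}) = \{\{(0,0),(0,1),(1,0),(1,1)\}\}$$

$$\max(\{V \in B \mid \alpha_B(\mathrm{post}_{S_2}(\gamma_B(V))) = \{(0,0),(0,1)\}\}) = \{\{(0,0),(0,1),(1,0),(1,1)\}\}$$

$$\max(\{V \in B \mid \alpha_B(\mathrm{post}_{S_2}(\gamma_B(V))) = \{(0,0)\}\}) = \{\{(0,1),(1,1)\}\}$$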

Hence, by Theorem 2.4, the kernel $\mathcal{K}_F(B)$ of $B$ for $F \triangleq \{\mathrm{post}_{S_1}, \mathrm{post}_{S_2}\}$ is:

$$\mathrm{Cl}_{\cap}\big(\mathrm{Cl}_{\cup}(\{\{(0,0)\}, \{(1,1)\}, \{(0,0),(0,1)\}, \{(0,1),(1,1)\}, \{(0,0),(0,1),(1,0),(1,1)\}\})\big) = \mathrm{Cl}_{\cup}(\{\{(0,0)\}, \{(0,1)\}, \{(1,1)\}\})$$

where we observe that the set $\{(0,1)\}$ is obtained as the intersection
$\{(0,0),(0,1)\} \cap \{(0,1),(1,1)\}$. This correctness kernel $\mathcal{K}_F(B)$ can therefore be
represented as

$$\langle \wp(\{(0,0),(0,1),(1,1)\}) \cup \{\{(0,0),(0,1),(1,0),(1,1)\}\}, \subseteq \rangle.$$

Thus, it turns out that $\mathcal{K}_F(B)$ is a proper abstraction of the Boolean abstraction $B$ that,
for example, is not able to express precisely the property $p_1 \wedge \neg p_2 \equiv (z = 0) \wedge (x \neq y)$.
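The kernel computation above can be checked mechanically. The Python sketch below assumes, following the displayed computation rather than the precise statement of the cited theorem, that the kernel is generated by the images of the two best correct approximations together with the maximal elements mapped to each image, closed first under unions and then under intersections. The tables are copied from the text; all function names are hypothetical.

```python
from itertools import chain, combinations, product

ATOMS = list(product((0, 1), repeat=2))      # the four valuations of (p1, p2)
TOP = frozenset(ATOMS)

# b.c.a. tables on single valuations, copied from the text; both maps are
# additive, so their value on a set is the union of the values on its members.
BCA_S1 = {a: frozenset({(1, 1)}) for a in ATOMS}
BCA_S2 = {(0, 0): frozenset({(0, 0), (0, 1)}), (0, 1): frozenset({(0, 0)}),
          (1, 0): frozenset({(0, 0), (0, 1)}), (1, 1): frozenset({(0, 0)})}

def powerset(xs):
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

def apply(table, V):
    return frozenset(chain.from_iterable(table[a] for a in V))

def generators(table):
    """Each image of the b.c.a. together with the maximal elements mapped to it."""
    domain = [V for V in powerset(ATOMS) if V]   # empty set left out, as in the text
    gens = set()
    for img in {apply(table, V) for V in domain}:
        fibre = [V for V in domain if apply(table, V) == img]
        gens.add(img)
        gens.update(V for V in fibre if not any(V < W for W in fibre))
    return gens

def close(family, op):
    """Smallest superset of `family` closed under the binary operation `op`."""
    family = set(family)
    while True:
        new = {op(a, b) for a in family for b in family} - family
        if not new:
            return family
        family |= new

gens = generators(BCA_S1) | generators(BCA_S2)
kernel = close(close(gens, frozenset.union), frozenset.intersection)
expected = set(powerset([(0, 0), (0, 1), (1, 1)])) | {TOP}
print(kernel == expected)                # True
print(frozenset({(1, 0)}) in kernel)     # False
```

The second check reflects the remark in the text: the valuation (1, 0), that is, $p_1 \wedge \neg p_2$, has no exact representation in the kernel.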
It is interesting to compare this correctness kernel $\mathcal{K}_F(B)$ with Ball et al.'s [2]
Cartesian abstraction of $B$. The Cartesian abstraction is defined as

$$C \triangleq \langle \{0,1,*\}^n \cup \{\bot_C\}, \leq \rangle$$

where $\leq$ is the componentwise ordering between tuples of values in $\{0,1,*\}$ ordered
by $0 < *$ and $1 < *$ ($\bot_C$ is a bottom element that represents the empty set of states).
The concretization function $\gamma_C : C \to \wp(\mathrm{States})$ is as follows:

$$\gamma_C((v_1, \ldots, v_n)) \triangleq \{ s \in \mathrm{States} \mid (s \models p_1, \ldots, s \models p_n) \leq (v_1, \ldots, v_n) \}.$$

It turns out that these two abstractions are not comparable. For instance, $(1,0) \in C$
represents $p_1 \wedge \neg p_2$ which is instead not represented by $\mathcal{K}_F(B)$, while
$\{(0,0),(1,1)\} \in \mathcal{K}_F(B)$ represents $(\neg p_1 \wedge \neg p_2) \vee (p_1 \wedge p_2)$ which is not
represented in $C$. However, while the correctness kernel guarantees no loss of information
in analyzing the program $P$ (and therefore the analysis with $\mathcal{K}_F(B)$ concludes that (*)
cannot be reached), the analysis of $P$ with the Cartesian abstraction $C$ is inconclusive
because:

$$\bot_C \to (1,1) \to (0,0) \vee_C (1,1) = (*,*)$$

where $\gamma_C((*,*)) = \mathrm{States}$, so that at the exit of the do-while loop one cannot infer
with $C$ that line (*) is unreachable.
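The loss of precision in the Cartesian domain can be seen in a few lines of Python. This is a sketch under assumptions: `None` encodes the bottom element, the string `"*"` encodes the unknown component, and the abstract values after the two loop-body fragments are taken from the chain above rather than recomputed. The names `join`, `concretize` and `loop_end` are hypothetical.

```python
from itertools import product

STAR = "*"
BOTTOM = None        # encodes the bottom element of C (the empty set of states)

def join(u, v):
    """Least upper bound in C: components that disagree become '*'."""
    if u is BOTTOM:
        return v
    if v is BOTTOM:
        return u
    return tuple(a if a == b else STAR for a, b in zip(u, v))

def concretize(v):
    """Boolean valuations described by a Cartesian tuple."""
    if v is BOTTOM:
        return set()
    choices = [(0, 1) if c == STAR else (c,) for c in v]
    return set(product(*choices))

after_s1 = (1, 1)                    # after z := 0; x := y
after_s2 = (0, 0)                    # after x++; z := 1
loop_end = join(after_s2, after_s1)
print(loop_end)                      # ('*', '*')

# Valuations compatible with the loop exit condition x == y (p2 = 1).
exit_vals = {b for b in concretize(loop_end) if b[1] == 1}
print(sorted(exit_vals))             # [(0, 1), (1, 1)]
```

The valuation (0, 1) survives at the loop exit, so `z != 0` cannot be excluded, whereas the set-based join `{(0, 0), (1, 1)}` keeps the correlation between the two predicates.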

6 Related and Future Work 

Few examples of abstraction simplifications are known. A general notion of domain
simplification and compression in abstract interpretation has been introduced in [12,15]
as a formal dual of abstraction refinement. This duality has been further exploited in
[13] to include semantics transformations in a general theory for transforming
abstractions and semantics based on abstract interpretation. Our domain transformation does
not fit directly in this framework. Following [15], given a property $\mathcal{P}$ of abstract
domains, the so-called core of an abstract domain $A$, when it exists, provides the most
concrete simplification of $A$ that satisfies the property $\mathcal{P}$, while the so-called
compressor of $A$, when it exists, provides the most abstract simplification of $A$ that induces
the same refined abstraction in $\mathcal{P}$ as $A$ does. Examples of compressors include the least
disjunctive basis [16], where $\mathcal{P}$ is the abstract domain property of being disjunctive, and
examples of cores include the completeness core [18], where $\mathcal{P}$ is the domain property
of being complete for some semantic function. The correctness kernel defined in this
paper is neither an instance of a domain core nor an instance of a domain compression.
It is not a core because, given an abstraction $A$, the correctness kernel of $A$ characterizes
the most abstract domain that induces the same best correct approximation of a function
$f$ on $A$, whilst the notion of domain core for the domain property $\mathcal{P}_A$ of inducing the
same b.c.a. as $A$ would not be meaningful, as this would trivially yield $A$ itself. It is
not a compression because there is no (unique) maximal domain refinement of an abstract
domain which induces the same property $\mathcal{P}_A$.

The EGAS methodology opens some stimulating directions for future work, such as
(1) the formalization of a precise relationship between EGAS and CEGAR and (2) an
experimental evaluation of the integration in the CEGAR loop of the EGAS-based
refinement strategy of Section 4. It is here useful to recall that some work formalizing
CEGAR in abstract interpretation has already been done [10,14]. On the one hand, [14]
shows that CEGAR corresponds to iteratively computing a so-called complete shell [18]
of the underlying abstract model $A$ with respect to the concrete successor transformer,
while [10] formally compares CEGAR with an abstraction refinement strategy based on
the computations of abstract fixpoints in an abstract domain. These works can therefore
provide a starting point for studying the relationship between EGAS and CEGAR in a
common abstract interpretation setting.